Abstract: Data loss in wireless sensing applications is inevitable, and while there have been many attempts at coping with this issue, recent developments in the area of Compressive Sensing (CS) provide a new and attractive perspective. Since many physical signals of interest are known to be sparse or compressible, employing CS not only compresses the data and reduces the effective transmission rate, but also improves the robustness of the system to channel erasures. This is possible because reconstruction algorithms for compressively sampled signals are not hampered by the stochastic nature of wireless link disturbances, which has traditionally plagued attempts at proactively handling the effects of these errors. In this paper, we propose that if CS is employed for source compression, then CS can further be exploited as an application layer erasure coding strategy for recovering missing data. We show that CS erasure encoding (CSEC) with random sampling is efficient for handling missing data in erasure channels, paralleling the performance of BCH codes, with the added benefit of graceful degradation of the reconstruction error even when the amount of missing data far exceeds the designed redundancy. Further, since CSEC is equivalent to nominal oversampling in the incoherent measurement basis, it is computationally cheaper than conventional erasure coding. We support our proposal through extensive performance studies.
I. Introduction

Data loss in wireless sensing applications is inevitable, either due to exogenous (such as transmission medium impediments) or endogenous (such as faulty sensors) causes. While many schemes have been proposed to cope with this issue, the emerging area of compressive sensing enables a fresh perspective for sensor networks. Many physical phenomena are compressible in a known domain and it is beneficial to use some form of source coding or compression, whenever practical, to reduce redundancy in the data prior to transmission. For example, sounds are compactly represented in the frequency domain whereas images may be compressed in the wavelet domain. Traditionally, compression is performed at the application layer after the signal is sampled and digitized and typically imposes a high computation overhead at the encoder. This cost is the major reason that low-power embedded sensing systems have to make a judicious choice about when to employ source coding [14]. Advances in compressive sensing (CS) [5], however, have made it possible to shift this computation burden to the decoder, presumably a more capable data sink (e.g., a wireless sensor network's base station), which is neither power nor memory bound. CS enables source compression to be performed inexpensively at the encoder, with a slight sampling overhead¹ and with little or no knowledge of the compression domain.

Compression, however, also makes each transmitted bit of information more precious, necessitating a reliable transport mechanism to maintain the quality of information. To cope with channel disturbances, retransmission schemes have popularly been applied, but they are inefficient in many scenarios, such as on acoustic links used for underwater communication [1], where round trip delays and ARQ traffic cost precious throughput. Retransmissions are ineffective in other cases too, for example, in multicast transmissions or when transmission latency is paramount for rapid detection. Forward error correction schemes like Reed-Solomon [15], LT [12] or convolutional codes are better suited for these scenarios, but their use in low-power sensing has been limited, primarily because of their computational complexity or bandwidth overhead [16].

Fortunately, the computational benefits of CS coupled with its inherent use of randomness can make it an attractive choice for combating erasures as well. A key observation that makes this possible is that reconstruction algorithms for compressively sampled data exploit randomness within the measurement process. Therefore, the stochastic nature of wireless link losses and short-term sensor malfunctions do not hamper the performance of reconstruction algorithms at the decoder. In fact, to the decoder, losses are indistinguishable from an a priori lower sensing rate. We, therefore, propose using compressive sensing as a low encoding-cost, proactive erasure coding strategy and show, in particular, that employing CS erasure coding (CSEC) has three desirable features:

• CSEC is achieved by nominal oversampling in an incoherent measurement basis. Compared to the cost of conventional erasure coding that is applied over the entire data set from scratch, additional sampling can be much cheaper, especially if random sampling is used. The high cost of CS decoding is amortized over joint source and channel coding and is free, if CS was already being employed for source decompression.


¹ CS sampling incurs a logarithmic overhead when compared to acquiring the signal directly in the compression domain.
• The performance of CS erasure coding with random sampling is similar to conventional schemes such as Reed-Solomon and, in general, the BCH family of codes, in that it can recover as many missing symbols for the same relative redundancy in a memoryless erasure channel. This aspect is covered in Sec. II.D.

• CS erasure coding is robust to estimation error in channel loss probability. For example, if a BCH code of block size $n$ were designed to correct up to $t$ erasures, in a situation where $e > t$ erasures occur, the entire block of $n$ symbols would be discarded. This implies that BCH codes must consider and be designed for a worst-case loss probability for recovery to succeed. An equivalent CS strategy, however, guarantees that even if $e > t$ symbols are lost, the best approximation of the signal is reconstructed from the remaining $n - e$ symbols. This means that even if channel coding fails at the physical layer, CSEC can recover the signal at the application layer.

Despite its advantages, CS erasure coding is not intended as a replacement for traditional physical layer channel codes. It is neither as general-purpose (i.e., it cannot be used for arbitrary non-sparse data), nor is the decoding as computationally efficient (yet). Instead, CSEC should be considered as a coding strategy that is applied at the application layer, where it utilizes knowledge of signal characteristics for better performance. In this regard, it is the reduced encoding cost that makes CSEC especially attractive for low-power embedded sensing. We quantify its energy efficiency benefits in Sec. III.C.

We highlight the conventional and proposed approaches in Figs. 2 and 1 respectively. Notation used in the figures is introduced in Sec. II. Typically, source coding is performed after the signal is completely acquired, removing redundancy in the samples through a lossy or lossless compression routine. This step is performed at the application layer and utilizes known signal models to determine the most succinct representation domain for the phenomenon of interest. The compressed data is then handed to the communication protocol stack, where just before transmission, usually at the physical layer, the data may be encoded again to introduce a controlled amount of redundancy. If transmitted symbols are received in error or not at all, the decoder may be able to recover the original data using this extra information.

If, on the other hand, compressive sampling were to be employed for joint source and channel coding, the sampling stage would itself subsume the coding blocks. CS sampling uses one of a variety of random measurement techniques that ensure that sufficient unique information is captured by the sampling process with high probability. We propose that the CS sampling block should be designed not merely to include prior knowledge of signal characteristics in terms of its sparsity in a specific domain, but consider channel characteristics as well. In particular, we propose tuning the sampling process, e.g., through judicious oversampling, to improve the robustness to channel impairments. We show in Sec. II.D that the universality of compressive sensing to the sparsity domain extends to the channel model as well, making CSEC advantageous even when channel characteristics are not precisely known. In particular, we will see in Sec. III.B that signal reconstruction performance with CSEC degrades gracefully when the average sampling rate at the acquisition stage is insufficient for exact recovery.
Figure 2. The conventional sequence of source and channel coding.
Figure 1. Proposed joint source-channel coding using compressive sensing.


II.  Compressive  Sensing  for  Erasure  Coding 

The problem we seek to address is acquiring a length-$n$ signal vector $f \in \mathbb{R}^n$ at a sensor node and communicating a length-$k$ measurement vector $z \in \mathbb{R}^k$ such that $f$ can be recovered accurately at a base station one or more wireless hops away. We assume a generic wireless sensing application, where the signals are sparse or compressible in a known domain, and the data is collected centrally at a capable base station. To construct our argument, we first briefly discuss both channel coding and compressive sensing. We then propose a compressive coding strategy in which oversampling suffices for robust data transmission.

A.  Channel  Coding  Overview 

If we consider a simple sense-and-send scenario where we send the sensed signal $f \in \mathbb{R}^n$ to a base station over an unreliable communication channel, $z = f$ and $k = n$. However, if a channel coding function $F_c$ is applied prior to transmission, $z = F_c(f)$ and, since channel coding increases the average transmission rate by adding redundancy, $k > n$. Consider a linear channel coding function $z = \Omega f$, where $\Omega \in \mathbb{R}^{k \times n}$ is the equivalent channel coding matrix. When $z$ is transmitted through a lossy channel, some measurements may not be received at the other end. We define the received measurement vector $z'$ of length $k' = k - e$, where $e$ is the number of erasures. The channel may also be modeled as a linear operator $C \in \mathbb{R}^{k' \times k}$ so that $z' = Cz$. In general, $C$ can consist of any values, but for the class of erasure channels we consider here, $C$ is a sub-matrix of an identity matrix $I_k$, where $e$ rows have been omitted. Recovering the original signal from the received data is then a decoding operation of the form:

$$\hat{f} = (C\Omega)^{+} z', \qquad (1)$$

where $X^{+} = (X^T X)^{-1} X^T$ is the Moore-Penrose pseudoinverse.

If $C\Omega$ is full rank, the decoding will be successful; otherwise, the signal $f$ cannot be recovered and the received measurement vector $z'$ is discarded. Based on the application, the encoder may either re-send $z$ or may re-encode $f$ with a higher redundancy code before retransmitting.

We would like to emphasize a property of erasure channels and linear coding here. The data missing from the received vector $z'$ is removed by the channel matrix $C$, which is generated by omitting rows from an identity matrix $I_k$ at the indices corresponding to the missing data. But, since $z$ is formed using $z = \Omega f$, one may instead view the combined coding and loss process as one of coding alone, where $\Omega' = C\Omega$ is the equivalent coding matrix generated from $\Omega$ by omitting $k - k'$ of its rows at the indices corresponding to the lost data. This means that missing data at the receiver can be considered the same as not having those rows in $\Omega$ to begin with. We will use this perspective later when we discuss properties of compressive sensing.
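The equivalence between erasing entries of $z$ and deleting rows of $\Omega$, together with the decoding step in (1), can be illustrated with a minimal NumPy sketch. The Gaussian code matrix, the dimensions, and the helper name `surviving_rows` are assumptions chosen for illustration and are not taken from the paper.

```python
import numpy as np

def surviving_rows(k, lost):
    """Indices of the k transmitted measurements that are not erased."""
    lost = set(lost)
    return [i for i in range(k) if i not in lost]

rng = np.random.default_rng(0)
n, k = 8, 12                          # signal length and coded length (k > n)
f = rng.standard_normal(n)            # signal to protect
Omega = rng.standard_normal((k, n))   # hypothetical dense linear code
z = Omega @ f                         # transmitted measurements

lost = [1, 5, 10]                     # indices erased by the channel (e = 3)
kept = surviving_rows(k, lost)
C = np.eye(k)[kept]                   # channel matrix: I_k with e rows omitted
z_received = C @ z                    # z' = C z

# Losing entries of z is the same as never having those rows in Omega.
Omega_eff = Omega[kept]
rows_match = np.allclose(C @ Omega, Omega_eff)

# Decoding as in (1): succeeds when C @ Omega has full column rank n.
decodable = np.linalg.matrix_rank(Omega_eff) == n
f_hat = np.linalg.pinv(Omega_eff) @ z_received
```

With $e = 3$ erasures, $k' = 9 \ge n$ rows remain, so the pseudoinverse recovers $f$; erasing more than $k - n$ rows would leave $C\Omega$ rank deficient and the block would be discarded.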

Now, if we knew that the signal $f$ contained redundancy, we could have compressed it before channel coding. We represent $f$ using a sparse vector $x \in \mathbb{R}^n$, by transforming it through an ortho-normal basis $\Psi \in \mathbb{R}^{n \times n}$ using $f = \Psi x$. For example, if $f$ was an acoustic waveform and $\Psi$ was the inverse FFT basis operator, $x$ would be the Fourier coefficients of the waveform. In the traditional (lossy) source-channel sequential coding process, the largest $m$ coefficients of $x$, where $m \ll n$, would be passed to the channel encoder. Let $y = Rx$ be the input to the channel coder, where $R \in \mathbb{R}^{m \times n}$ is a sub-matrix of $I_n$ that defines the indices of $x$ selected for transmission. The output at the sensor node would then be $z = \Omega y = \Omega R \Psi^{-1} f$, where $\Omega$ is now of size $k \times m$. At the receiving end, the channel decoder first recovers $\hat{y}$ and thus $\hat{x}$ using (1) (replacing $\hat{f}$ with $\hat{y}$) and then $\hat{f} = \Psi\hat{x}$.

B.  Compressive  Sensing  Fundamentals 

The theory of compressive sensing asserts that the explicit compression step $x = \Psi^{-1} f$ does not need to be performed at the encoder and that a much smaller 'incoherent' transformation may be performed instead. We consider a sensing matrix $\Phi \in \mathbb{R}^{m \times n}$ that generates $m$ ($m \ll n$) of these incoherent measurements directly by projecting the signal $f$ in its native domain² through $y = \Phi f$. In the usual synchronous sampling regime, $\Phi$ is an identity matrix $I_m$ and $m = n$. When employing compressive sensing, however, the sensing matrix may be generated pseudo-randomly using various statistical distributions that ensure that sufficient unique information is captured with high probability. The questions that CS theory answers are: How can $f$ be recovered from $y$? How many measurements $m$ are required for accurate recovery and what sensing matrices $\Phi$ facilitate recovery? We summarize some key results from [4] and references therein.

The foundational argument behind compressive sensing is that although $\Phi$ is not full rank, $x$ and hence $f$ can be 'decoded' by exploiting the sparsity of $x$ coupled with the sparsity promoting property of the $\ell_1$ norm. To accomplish this, we view $y$ as being generated through $y = \Phi\Psi x = Ax$ instead of through $y = \Phi f$. Now, while there are infinitely many solutions to $y = Ax$, the CS reconstruction procedure selects the one with the least sum of magnitudes by solving a constrained $\ell_1$ minimization problem [7]:

$$\hat{x} = \arg\min_{\tilde{x}} \|\tilde{x}\|_{\ell_1} \quad \text{s.t.} \quad y = A\tilde{x}, \qquad (2)$$

where $\|x\|_{\ell_1} = \sum_{i=1}^{n} |x_i|$ and $\hat{f}$ is recovered using $\hat{f} = \Psi\hat{x}$ as before. To guarantee that the solution from (2) is exact, a notion termed the restricted isometry property (RIP) was introduced. We will return to the RIP shortly, but first explain how the above procedure could be extended to handle missing data.

² The native domain for typical analog-to-digital conversion is time, but in some cases like photonic ADCs [3], sampling occurs in the frequency domain.

C.  Handling  Data  Losses  Compressively 

Since the compressively sampled measurements in $y$ are a compact representation of $f$, a valid scheme to protect $y$ from channel erasures would be to feed it to a channel coding block as before. Thus, the sensor would now emit the coded measurements $z = \Omega y = \Omega\Phi f$, with $\Omega$ being of size $k \times m$ ($k > m$). At the receiver, recovering $f$ proceeds by first recovering $\hat{y}$ from $z' = Cz$ using (1) and then $\hat{x}$ using (2), if channel decoding succeeds.

If we consider each step in the above process, we see that compressive sampling concentrated the signal information in a set of $m$ measurements and then channel coding dispersed that information to a larger set of $k$ measurements. A natural question to ask is how the dispersion scheme differs in essence from the concentration scheme and whether they can be unified. The answer to this question is the crux of this paper.

We argue that compressive sensing not only concentrates but also spreads information across the $m$ measurements acquired. This perspective is backed by Theorem 2 (below) and is the primary reason for the logarithmic rate overhead experienced by CS practitioners. Based on this observation, we propose that an efficient strategy for improving the robustness of data transmissions is to augment the sensing matrix $\Phi$ with $e$ additional rows generated in the same way as the first $m$ rows. These extra rows constitute extra measurements, which, under channel erasures, will ensure that sufficient information is available at the receiver. Note that oversampling in the native domain of $f$ is also a valid strategy, but is highly inefficient. On the other hand, we will show next that if $k = m + e$ incoherent measurements are acquired through "compressive oversampling" and $e$ erasures occur in the channel, the CS recovery performance will equal that of the original sensing matrix with a pristine channel with high probability (w.h.p.). We denote the augmented sensing matrix as $\bar{\Phi} \in \mathbb{R}^{k \times n}$ and the samples received at the decoder would be $z' = C\bar{\Phi} f$. The decoding and recovery procedures for this case are now performed in one step using (2), but constrained by $z'$ (instead of $y$) to incorporate augmentation and losses: $z' = C\bar{\Phi}\Psi x = C\bar{A}x = \bar{A}'x$.

To understand intuitively why such an approach might work, assume that both $\Phi$ and $\bar{\Phi}$ are generated randomly with each element being an instance of an i.i.d. random variable. From our earlier discussion on viewing missing measurements as missing rows in the coding matrix, we observe that with $k = m + e$ and $e$ missing measurements, $\Phi' = C\bar{\Phi}$ would be of size $m \times n$. Now, since each element of $\bar{\Phi}$ is i.i.d. and the erasure channel does not modify its value, we can view $\Phi'$ as being generated with $m$ rows to begin with, just like $\Phi$. So, while $\Phi$ and $\Phi'$ will not be identical, their CS reconstruction performance, which depends on their statistical properties and their size, will be equal (with high probability). We explain this analytically in the following section.
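The intuition above can be tried numerically. The following is a minimal sketch, not the paper's implementation: it assumes a signal that is sparse in its native domain ($\Psi = I$), a Gaussian augmented matrix, and SciPy's `linprog` as a stand-in $\ell_1$ solver for (2), using the standard split $x = u - v$ with $u, v \ge 0$. All dimensions and names are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

def l1_recover(A, y):
    """Solve min ||x||_1 s.t. A x = y as a linear program with x = u - v, u, v >= 0."""
    n = A.shape[1]
    result = linprog(
        c=np.ones(2 * n),            # sum(u) + sum(v) equals ||x||_1 at the optimum
        A_eq=np.hstack([A, -A]),     # A (u - v) = y
        b_eq=y,
        bounds=(0, None),
        method="highs",
    )
    if not result.success:
        raise RuntimeError(result.message)
    return result.x[:n] - result.x[n:]

rng = np.random.default_rng(1)
n, m, e, s = 64, 32, 8, 3            # signal length, base measurements, extra rows, sparsity
k = m + e

x = np.zeros(n)                      # s-sparse signal (assumes Psi = I)
x[rng.choice(n, size=s, replace=False)] = rng.standard_normal(s)

Phi_bar = rng.standard_normal((k, n)) / np.sqrt(n)   # augmented sensing matrix
z = Phi_bar @ x                                      # k transmitted measurements

received = np.sort(rng.choice(k, size=k - e, replace=False))  # e random erasures
x_hat = l1_recover(Phi_bar[received], z[received])            # decode from the m survivors
```

The decoder only needs to know which rows arrived; it then solves the same problem it would have solved with an $m$-row matrix and a lossless channel.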


D.  Robustness  of  CSEC  to  Erasures 

We can show that CS oversampling is not only a valid erasure coding strategy, but also an efficient one. In particular, we would like to show that if we augment the sensing matrix to include $e$ extra measurements and any $e$ from the set of $k = m + e$ measurements are lost (randomly and independently) in transmission, the performance of CS reconstruction is equal to that of the original un-augmented sensing matrix (with high probability). To accomplish this, we rely on results from compressive sensing theory. We define the restricted isometry constant $\delta_s$ of a matrix $A = \Phi\Psi$ and reproduce a fundamental result from [4] that links $\delta_s$ to CS performance.

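Definition 1. [4] For each integer $s = 1, 2, \ldots$, define the isometry constant $\delta_s$ of a matrix $A$ as the smallest number such that:

$$(1 - \delta_s)\,\|x\|_{\ell_2}^2 \le \|Ax\|_{\ell_2}^2 \le (1 + \delta_s)\,\|x\|_{\ell_2}^2 \qquad (3)$$

holds for all $s$-sparse vectors $x$. A vector is said to be $s$-sparse if it has at most $s$ non-zero entries.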

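Theorem 2. [4] Assume that $\delta_{2s} < \sqrt{2} - 1$ for some matrix $A$, then the solution $\hat{x}$ to (2) obeys:

$$\|\hat{x} - x\|_{\ell_1} \le C_0 \cdot \|x - x_s\|_{\ell_1} \qquad (4)$$

$$\|\hat{x} - x\|_{\ell_2} \le C_0 \cdot s^{-1/2}\, \|x - x_s\|_{\ell_1} \qquad (5)$$

for some small positive constant $C_0$, where $x_s$ is an approximation of a non-sparse vector with only its $s$ largest entries. In particular, if $x$ is $s$-sparse, the reconstruction is exact.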


This theorem not only guarantees that CS reconstruction will be exact if $\delta_{2s}(A) < \sqrt{2} - 1$ for an $s$-sparse signal, but also that if $x$ is not $s$-sparse, but is compressible, with a power-law decay in its coefficient values, $\ell_1$ minimization will result in the best $s$-sparse approximation of $x$, returning its largest $s$ coefficients. We will return to this property when we compare the performance of CSEC with traditional erasure coding. Note also that this is a deterministic result.


CS theory also suggests mechanisms to generate $A$ matrices that satisfy the RIP with high probability. For example, it has been shown in [7] that the matrix $A$ can be constructed randomly using i.i.d. random variables, with a Gaussian $A_{ij} \sim \mathcal{N}(0, 1/n)$ distribution or an equi-probable Bernoulli $A_{ij} = \pm 1/\sqrt{n}$ distribution. Using such matrices in low-power sensing devices, however, is difficult since implementing the sensing matrix $\Phi = A\Psi^{-1}$ involves sampling and buffering $f$ and computing $y = \Phi f$ explicitly through complex floating point operations. It was also shown in [7] and [16] that if $A$ is constructed by randomly selecting rows of a Fourier basis matrix, the number of measurements obeys:

$$s \le C_2\, \frac{m}{\log^4(n)} \qquad (6)$$


with high probability. This is a significant result indeed because it implies that, if the signal is sparse in the Fourier domain, $\Phi = A\Psi^{-1}$ is essentially an $m \times n$ random sampling matrix constructed by selecting $m$ rows independently and uniformly from an identity matrix $I_n$. This $\Phi$ is trivially implemented by pseudo-randomly sampling $f$, $m$ times and communicating the stream of samples and their timestamps to the fusion center. Matrix $\Phi$ can then be recreated at the fusion center from the timestamps. The limitation on $\Psi$ being the Fourier basis was removed in [16], which showed that the bound in (6) extends to any dense ortho-normal basis matrix $\Psi$ with uniform random sampling.
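As an illustration of how light the encoder side of random sampling can be, the sketch below emits (timestamp, sample) pairs at the sensor and rebuilds the corresponding rows of $\Phi$ at the fusion center. The function names, the seed, the toy signal, and the choice to draw distinct indices are assumptions made for this example only.

```python
import random

def sense(signal, m, seed):
    """Sensor side: pick m distinct sample indices pseudo-randomly and emit (timestamp, value) pairs."""
    rng = random.Random(seed)
    timestamps = sorted(rng.sample(range(len(signal)), m))
    return [(t, signal[t]) for t in timestamps]

def rebuild_phi(timestamps, n):
    """Fusion center: each received timestamp t contributes row t of the n x n identity."""
    return [[1 if col == t else 0 for col in range(n)] for t in timestamps]

signal = [0.0, 1.5, -2.0, 0.5, 3.0, -1.0, 2.5, 4.0]
packets = sense(signal, m=4, seed=7)

# Suppose the second packet is erased in the channel.
received = packets[:1] + packets[2:]
phi_received = rebuild_phi([t for t, _ in received], len(signal))

# Applying the rebuilt matrix to the signal reproduces exactly the received samples.
projected = [sum(a * b for a, b in zip(row, signal)) for row in phi_received]
```

The encoder performs no arithmetic on the samples at all; an erased packet simply means the decoder builds a matrix with one fewer row.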

Assume that the transmission channel can be modeled using an independent Bernoulli process with mean loss probability $p$. Thus, the likelihood of any measurement being dropped in this memoryless erasure channel is equal and is $p$. To show now that CSEC with random sampling is efficient for this channel model, we need to show that reconstruction performance with $A = \Phi\Psi$, where $\Phi \in \mathbb{R}^{m \times n}$, and $A' = C\bar{\Phi}\Psi$, where $\bar{\Phi} \in \mathbb{R}^{k \times n}$, is equal with high probability when $k = m/(1 - p)$. The factor $1 - p$ is the expected fraction of measurements that survive the channel. However, note that since $\Psi$ is an ortho-normal basis matrix, it is equivalent to show that sensing performance with $\Phi$ and $\Phi' = C\bar{\Phi}$ is identical w.h.p.

Our approach considers the Fourier random sampling strategy, which constructs a $\Phi$ matrix by selecting samples from $f$ at random. For the bound in (6) to hold for matrix $\Phi$, two conditions need to be met:

a) At least $m$ samples need to be selected.

b) The indices should be selected using a uniform random distribution so that each sample is equi-probable.

To show that $\Phi'$ results in identical performance (w.h.p.) to $\Phi$, we need to show that the above conditions hold equally. We first show that $\Phi'$ satisfies condition (b). When the channel is memoryless with a loss probability $p$, an average of $k \cdot p$ samples are lost in transmission and only $k'$ samples are received. Let $S_{\Phi}$ denote the set of sample indices that were selected using $y = \Phi f$ for the random sampling case. Therefore, the cardinality of $S_{\Phi}$ would be $|S_{\Phi}| = m$. Similarly, let the set of indices chosen in $\bar{\Phi}$ be labeled as $S_{\bar{\Phi}}$ and the sample indices received by the decoder be $S_{C\bar{\Phi}}$, where $|S_{\bar{\Phi}}| = k$ and $|S_{C\bar{\Phi}}| = k'$. Since $\Phi$ is constructed randomly and uniformly, the probability of a sample $i$ being selected from $f$ is:

$$\Pr\big[\, i \in S_{\Phi} \;\big|\; |S_{\Phi}| = m \,\big] = \frac{m}{n} \qquad (7)$$

and for the oversampling case is:

$$\Pr\big[\, i \in S_{\bar{\Phi}} \;\big|\; |S_{\bar{\Phi}}| = k \,\big] = \frac{k}{n} \qquad (8)$$

Claim 3. If we transmit $k$ randomly chosen samples over an independent Bernoulli channel with a probability of lost transmission over the channel as $p$, the probability of the $i$th sample of $f$ being received in the $k'$ samples is:

$$\Pr\big[\, i \in S_{C\bar{\Phi}} \;\big|\; |S_{C\bar{\Phi}}| = k' \,\big] = \frac{k'}{n} \qquad (9)$$


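Proof. This result is intuitive and straightforward to prove.

$$\begin{aligned}
\Pr\big[\, i \in S_{C\bar{\Phi}} \;\big|\; |S_{C\bar{\Phi}}| = k' \,\big]
&= \frac{\Pr[\text{receiving sample } i \text{ correctly}] \cdot \Pr\big[\, |S_{C\bar{\Phi}}| = k' - 1 \,\big]}{\Pr\big[\, |S_{C\bar{\Phi}}| = k' \,\big]} \\
&= \frac{(1 - p) \cdot \Pr[\text{selecting sample } i] \cdot \Pr\big[\, |S_{C\bar{\Phi}}| = k' - 1 \,\big]}{\Pr\big[\, |S_{C\bar{\Phi}}| = k' \,\big]} \\
&= \frac{(1 - p) \cdot \Pr\big[\, i \in S_{\bar{\Phi}} \;\big|\; |S_{\bar{\Phi}}| = k \,\big] \cdot \binom{k-1}{k'-1}\, p^{k-k'} (1 - p)^{k'-1}}{\binom{k}{k'}\, p^{k-k'} (1 - p)^{k'}} \\
&= \frac{(1 - p)\, k/n}{k/k'} \cdot \frac{p^{k-k'} (1 - p)^{k'-1}}{p^{k-k'} (1 - p)^{k'}} = \frac{k'}{n}
\end{aligned}$$

Here the second factor in each numerator is the probability that $k' - 1$ of the remaining $k - 1$ transmitted samples are received, and the last step uses $\binom{k}{k'} / \binom{k-1}{k'-1} = k/k'$.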
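Claim 3 can also be checked exhaustively for a small instance. The sketch below enumerates every choice of $k$ indices and every erasure pattern with exact rational arithmetic; the function name and the values $n = 6$, $k = 4$, $p = 1/5$ are assumptions picked only to keep the enumeration small.

```python
from fractions import Fraction
from itertools import combinations, product

def received_index_probability(n, k, p, i=0):
    """Return {k': Pr[index i received | k' samples received]} by exhaustive enumeration."""
    joint = {}      # k' -> Pr[i received and exactly k' samples received]
    marginal = {}   # k' -> Pr[exactly k' samples received]
    subsets = list(combinations(range(n), k))        # uniform choice of k sample indices
    for chosen in subsets:
        for pattern in product((0, 1), repeat=k):    # 1 = delivered, 0 = erased
            k_prime = sum(pattern)
            prob = Fraction(1, len(subsets)) * (1 - p) ** k_prime * p ** (k - k_prime)
            marginal[k_prime] = marginal.get(k_prime, 0) + prob
            if any(idx == i and ok for idx, ok in zip(chosen, pattern)):
                joint[k_prime] = joint.get(k_prime, 0) + prob
    return {kp: joint.get(kp, Fraction(0)) / marginal[kp] for kp in marginal}

n, k, p = 6, 4, Fraction(1, 5)
conditional = received_index_probability(n, k, p)
# Each entry conditional[k'] equals k'/n, independent of p, as stated in (9).
```

The loss probability cancels in the conditional probability, which is the reason the received indices remain uniformly distributed.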

This means that, if the channel is modeled as an independent Bernoulli process and the input sample distribution is equiprobable over $n$ samples, the output index distribution is also equiprobable over the set of correctly received samples. This proves condition (b). If we increase the number of samples by the ratio lost in the channel such that $k = m/(1 - p)$, then $E[k'] = m$ and condition (a) is satisfied on average as well:

$$\Pr\big[\, i \in S_{C\bar{\Phi}} \;\big|\; |S_{C\bar{\Phi}}| = k' \,\big] = \Pr\big[\, i \in S_{\Phi} \;\big|\; |S_{\Phi}| = m \,\big] = \frac{m}{n} \qquad (10)$$

We, therefore, conclude that the sample indices in $\Phi'$ are statistically indistinguishable from the indices in $\Phi$ for a memoryless channel and that the bound for the required number of measurements holds equally (with high probability).
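The oversampling rule $k = m/(1 - p)$ is simple to evaluate. The following sketch computes $k$, the expected number of delivered measurements, and the binomial probability that at least $m$ arrive. The function names are hypothetical, $p$ is passed as a `Fraction` to avoid rounding in the ceiling, and the values $m = 256$, $p = 0.2$ are illustrative.

```python
from fractions import Fraction
from math import ceil, comb

def oversampled_length(m, p):
    """Smallest integer k with k * (1 - p) >= m, i.e. k = ceil(m / (1 - p)); p is a Fraction."""
    return ceil(Fraction(m) / (1 - p))

def prob_at_least(m, k, p):
    """Pr[k' >= m] when each of k measurements is delivered independently with probability 1 - p."""
    q = float(1 - p)
    return sum(comb(k, j) * q**j * (1 - q) ** (k - j) for j in range(m, k + 1))

m, p = 256, Fraction(1, 5)
k = oversampled_length(m, p)        # 320
expected_received = k * (1 - p)     # 256
tail = prob_at_least(m, k, p)
```

For these values the expected delivery count matches $m$ exactly, while the probability of receiving at least $m$ measurements in any single block is only about one half, which is why the condition is stated on average.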

E.  CSEC  Reconstruction  when  Redundancy  is  Insufficient 

While the above result indicates that signal recovery using CSEC is exact if the redundancy $k - m$ is higher than the number of erasures $k - k'$, it can also be shown that when $k' < m$, recovery can still proceed but results in an approximation of the signal. This is in contrast to traditional erasure coding schemes that necessitate that the matrix $C\Omega$ be invertible for any reconstruction to occur. To prove this we use Thm. 2 when applied to compressible signals. Assume that the signal of interest $f$ in its compressed form $x$ has its ordered coefficients decaying according to a power law such that $|x|_{(i)} \le C_r \cdot i^{-r}$, where $|x|_{(i)}$ is the $i$th largest coefficient magnitude (so that $|x|_{(1)} \ge |x|_{(2)} \ge \cdots \ge |x|_{(n)}$) and $r > 1$.

Assume also that the bound in (6) is satisfied in equality for some choice of $m$ and $s$. Now, when $k' < m$, $k'$ does not meet the bound for sparsity $s$. However, $k'$ is guaranteed to satisfy the bound for some lower sparsity $s' < s$ (by extension from [16]). For the set of $k'$ measurements received, Thm. 2 guarantees that CS reconstruction will result in the best $s'$-sparse approximation of $x$, returning its largest $s'$ coefficients. This implies, then, that the increase in the $\ell_1$ norm of the reconstruction error with $k'$ measurements will be limited to $\epsilon \le C_0 \|x_s - x_{s'}\|_{\ell_1}$ with high probability. We empirically study the effect of this reconstruction error on the probability of recovery in Sec. III.B.
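To give a feel for how gradually the error bound grows, the sketch below evaluates the best $s'$-term $\ell_1$ approximation error of a power-law signal as the number of received measurements drops. It assumes, as in the text, that (6) holds with equality at $(m, s)$, so that for fixed $n$ the supported sparsity scales proportionally with $k'$; the decay exponent, the constant $C_r = 1$, and all sizes are illustrative assumptions.

```python
from math import floor

def l1_tail(coeffs, s):
    """l1 error of the best s-term approximation: sum of all but the s largest magnitudes."""
    ordered = sorted((abs(c) for c in coeffs), reverse=True)
    return sum(ordered[s:])

def supported_sparsity(k_received, m, s):
    """Sparsity s' met by k' measurements if (6) holds with equality at (m, s) for fixed n."""
    return floor(s * k_received / m)

n, r = 1024, 1.5
x = [i ** -r for i in range(1, n + 1)]   # power-law decay with C_r = 1
m, s = 256, 32

errors = {}
for k_received in (256, 224, 192, 128, 64):
    s_prime = supported_sparsity(k_received, m, s)
    errors[k_received] = l1_tail(x, s_prime)   # ||x - x_{s'}||_1, the factor multiplying C_0 in (4)
```

The bound grows smoothly as measurements are lost, in contrast to a block code, where crossing the design threshold makes the whole block undecodable.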

III.  Evaluation  Results 

In  order  to  verify  the  performance  of  compressive  erasure 
coding,  we  analyze  the  sampling  matrix  A'  that  identifies 
which  measurements  were  received  at  the  decoder. 

A.  Verifiable  Conditions  using  RIP 

1)  For  a  Memoryless  Erasure  Channel 

We model the erasure introduced by the transmission channel with an average measurement loss probability, $p$. We initially assume an independent Bernoulli process. The question we would like to address is how the loss of $k \cdot p$ measurements (on average) affects CS reconstruction. From Thm. 2, we understand that reconstruction accuracy depends on the RIP constant $\delta_{2s}$ of $A$. To evaluate the extent of performance loss through the erasure channel, we can thus rely upon quantifying $\delta_s(A)$. Computing $\delta_s(A)$ exactly from Def. 1, however, is exhaustive because it is defined over all $s$-sparse vectors. We approximate it by evaluating the eigenvalues of its Gramian [2] over $10^3$ random $s \times n$ sub-matrices. Increasing this number to $10^6$ results in little improvement.
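A Monte Carlo estimate of this kind can be sketched as follows. This is one plausible reading rather than the authors' code: it assumes each sub-matrix is the restriction of $A$ to $s$ randomly chosen columns (an $s$-sparse support, as in Def. 1), that $\Psi$ is the unitary inverse-DFT basis, and that columns of $A$ are scaled to unit norm; the sizes are reduced to keep the example fast.

```python
import numpy as np

def estimate_rip(A, s, trials, rng):
    """Monte Carlo lower estimate of delta_s(A) from random s-column sub-matrices."""
    n = A.shape[1]
    delta = 0.0
    for _ in range(trials):
        support = rng.choice(n, size=s, replace=False)
        sub = A[:, support]
        eigvals = np.linalg.eigvalsh(sub.conj().T @ sub)   # spectrum of the Gram matrix
        delta = max(delta, eigvals[-1] - 1.0, 1.0 - eigvals[0])
    return delta

def random_sampling_matrix(n, rows):
    """A = Phi Psi with Phi selecting identity rows and Psi the unitary inverse DFT, unit-norm columns."""
    psi = np.fft.ifft(np.eye(n)) * np.sqrt(n)
    return psi[rows] * np.sqrt(n / len(rows))

rng = np.random.default_rng(3)
n, m, s, p = 128, 64, 4, 0.2
rows = np.sort(rng.choice(n, size=m, replace=False))   # random sampling instants
delivered = rows[rng.random(m) >= p]                   # independent Bernoulli erasures

delta_no_loss = estimate_rip(random_sampling_matrix(n, rows), s, 200, rng)
delta_lossy = estimate_rip(random_sampling_matrix(n, delivered), s, 200, rng)
```

Because only a subset of supports is visited, the value returned is a lower bound on the true constant; comparing `delta_no_loss` and `delta_lossy` across many draws of `rows` gives curves of the kind described for Fig. 3.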

We first generate a random sampling matrix $A = \Phi\Psi$ of size $m \times n$ as described in Sec. II.B. This sensing matrix is modified by the channel so that $A' = CA$. We also have an augmented sensing matrix generated at the source, $\bar{A} = \bar{\Phi}\Psi$ of size $k \times n$ with $k > m$, and its received counterpart $\bar{A}' = C\bar{A}$. Testing the performance of CSEC numerically then proceeds by comparing whether $\delta_s(\bar{A}') \le \delta_s(A)$. Equality ensures that the CS decoder would be able to achieve reconstruction accuracy identical to an un-augmented sensing matrix with a pristine channel. If $\delta_s(\bar{A}') < \delta_s(A)$, it means that the decoder has more measurements through $\bar{A}'$ than through the original $A$ and would lead to a higher probability of exact recovery.

The result from this calculation for a Monte Carlo simulation over 1000 random sampling matrices of size $256 \times 1024$ with the Fourier basis for reconstruction is shown in Fig. 3. The dotted blue curve labeled "No Loss" indicates $\delta_s(A)$, forming the baseline for our comparison. The shading illustrates the min-max values over all choices of $\Phi$. With loss probability $p = 0.2$, we see a mean increase in RIP constant, which implies that the sparsity for guaranteed $\ell_1$ reconstruction drops. In this case, we see that the sparsity drops from about 6 to 4 based on the $\delta_{2s} < \sqrt{2} - 1$ bound (gray horizontal line) from Thm. 2. Note, though, that while this bound is known to be conservative, enumerating the RIP constant in this way clearly indicates the loss in reconstruction accuracy that may be expected by losing 20% of the sampled measurements.

From Clm. 3, we see that the probability distributions of indices extracted
from $\Phi$ and $\Phi' = C\Phi$ are identical when $C$ comes from a memoryless
(independent Bernoulli) channel. This means that, if the channel is not
congested, increasing the sensing rate by a fraction $p/(1 - p)$, that is, to
$k = m/(1 - p)$, will restore the delivery rate to $k' = m$ on average. The
effect of this increase is substantiated in Fig. 3 and establishes
$\delta_s(A') \approx \delta_s(A)$ for the independent Bernoulli channel. We
see that not only does the mean RIP constant improve to its original value but
that the range of variation also recovers to the “No Loss” baseline. Note also
that the minimum values of $\delta_s(A')$ are below $\delta_s(A)$, suggesting
that some instances of $\Phi$ coupled with channel loss actually deliver
better-than-baseline performance.
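
The oversampling rate follows from requiring that the expected number of
delivered measurements equal $m$. The short derivation below restates that
argument; the numerical values are the ones used in the performance studies of
this section ($m = 64$, $p = 0.2$).

```latex
\[
  k' = k(1 - p) = m
  \;\Longrightarrow\;
  k = \frac{m}{1 - p} = m\left(1 + \frac{p}{1 - p}\right),
  \qquad
  m = 64,\ p = 0.2 \;\Rightarrow\; k = 80 .
\]
```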

To compare performance across different sampling schemes, we further evaluate
the RIP constant for a sensing matrix that is constructed using the Gaussian
random projection method described earlier. In this case, the reconstruction is
performed in the identity domain with $A = \Phi$. It has been shown in [7] that
the Gaussian projection technique has equivalent performance across any
orthonormal reconstruction basis, and the identity matrix was chosen for
computational ease. The result of this computation is shown in Fig. 4 for a
memoryless lossy channel with $p = 0.2$. Here too, we observe that while the
RIP constant is higher in the lossy case, increasing the rate by the amount
lost in the channel recovers the performance guaranteed by compressive sensing.

Figure 3. RIP constant for random sampling with a memoryless channel.

Figure 4. RIP constant for Gaussian projections with a memoryless channel.

2)  Interleaving  for  Bursty  Channels 

Realistic wireless channels exhibit bursty behavior [17]. To estimate the
effect of bursty channels on CSEC performance, we use the popular
Gilbert-Elliott (GE) model [10], which is both tractable to use and accurate in
describing many wireless channels (including those in mobile environments
[18]). GE channels are modeled using a stationary discrete-time binary Markov
process. Within each state, marked good and bad, the probability of loss is
assumed to be $p_g = 0$ and $p_b = 1$, respectively. The probabilities of
transition from one state to the other are denoted $p_{gb}$ and $p_{bg}$. To
maintain consistency with the memoryless channel performance studies, we
compute the transition probabilities from the average loss probability $p$ and
the expected burst size $b$ using standard relationships: $p_{bg} = 1/b$ and
$p_{gb} = p/(b(1 - p))$.
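
A minimal simulation of this two-state channel is sketched below, assuming the
relationships just stated. The function names and the choice to draw the
initial state from the stationary distribution are illustrative assumptions,
not details taken from the paper.

```python
import random

def ge_transition_probs(p, b):
    """Transition probabilities from average loss p and mean burst length b."""
    p_bg = 1.0 / b                  # bad -> good
    p_gb = p / (b * (1.0 - p))      # good -> bad
    return p_gb, p_bg

def ge_erasure_mask(k, p, b, rng):
    """Return k booleans; True marks a lost measurement."""
    p_gb, p_bg = ge_transition_probs(p, b)
    bad = rng.random() < p          # assumption: start in the stationary state
    mask = []
    for _ in range(k):
        mask.append(bad)            # loss probability is 1 in bad, 0 in good
        if bad:
            bad = rng.random() >= p_bg   # remain bad unless we switch to good
        else:
            bad = rng.random() < p_gb
    return mask

mask = ge_erasure_mask(80, p=0.2, b=8, rng=random.Random(1))
```

With $p = 0.2$ and $b = 8$ the transition probabilities evaluate to
$p_{bg} = 0.125$ and $p_{gb} = 0.03125$, and the stationary probability of the
bad state, $p_{gb}/(p_{gb} + p_{bg})$, equals $p$ as required.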

Fig. 6 shows the RIP constant $\delta_s$ at a loss probability of $p = 0.2$ and
with an expected loss burst of $b = 8$ samples. While $b = 8$ constitutes an
extreme condition of burstiness, it is instructive to see its effects on CS
recovery. The same number of samples is delivered to the fusion center as with
the memoryless channel on average, but with a modified index distribution. The
effect of this change is immediately evident in the variation of $\delta_s$ in
Fig. 6, indicating that some sensing matrices are particularly bad for a GE
channel. While an increase in sensing rate improves the mean RIP constant
(though not reaching the baseline), the variance remains quite high.

The variance issue can be resolved by applying randomized interleaving prior to
transmission, which results in a roughly uniform distribution of the sample
losses [13]. It can be shown that interleaving recovers the original index
distribution (up to a bound) for random sampling, and the green dotted curve
(coincident with the baseline) in Fig. 6 illustrates this empirically. Note,
however, that interleaving requires buffering $y$ (though not $f$), which
increases decoding latency. Interestingly, we observe that the Gaussian random
projection technique in Fig. 5 is unaffected by interleaving. In fact, using
Gaussian projections delivers near baseline performance with or without
interleaving (min-max variation is not perfect). This is because the sensing
matrix $\Phi$ is dense (compared to the one for random sampling), with each
element within it being i.i.d. Gaussian. This means that every measurement in
$y$ has a random, independent but statistically identical contribution from
every element in $f$. Since interleaving the measurements $y$ is equivalent to
shuffling the rows of the matrix $\Phi$, interleaving does not affect the
statistical properties of $\Phi$.

Figure 6. RIP constant for random sampling with a Gilbert-Elliott channel.

Figure 5. RIP constant for Gaussian projections with a Gilbert-Elliott channel.
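
The effect of randomized interleaving on a single loss burst can be illustrated
with the sketch below. It is an assumption-laden toy: the helper names, the
block length of 64, the burst position, and the use of a seeded shuffle that
sender and receiver are presumed to share are all hypothetical choices made for
illustration.

```python
import random

def lost_measurements(k, lost_slots, perm=None):
    """Map lost transmission slots back to original measurement indices.

    perm[j] is the measurement index sent in slot j; None means in-order.
    """
    if perm is None:
        perm = list(range(k))
    return sorted(perm[j] for j in lost_slots)

def longest_run(indices):
    """Length of the longest run of consecutive integers in a sorted list."""
    best = run = 0
    prev = None
    for i in indices:
        run = run + 1 if prev is not None and i == prev + 1 else 1
        best = max(best, run)
        prev = i
    return best

rng = random.Random(0)   # assumption: seed known to both sender and receiver
k = 64
perm = list(range(k))
rng.shuffle(perm)        # interleaver: slot j carries measurement perm[j]
burst = range(20, 28)    # one burst of 8 consecutive lost slots
plain = lost_measurements(k, burst)        # losses without interleaving
mixed = lost_measurements(k, burst, perm)  # losses with interleaving
```

Without interleaving the burst removes eight adjacent measurements; with it,
the same eight losses are scattered across the block. The sender must hold the
whole block of $y$ before transmitting, which is the buffering cost noted
above.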

A more extensive treatment of this “democratic” property of Gaussian sensing
matrices can be found in a recent report by Davenport et al. [24], which
analyses the performance from a theoretical perspective. The democracy argument
has been employed in [25] for the novel application of handling saturation
errors in analog-to-digital quantizers, in a fashion similar to what we propose
in CSEC.

B.  Signal  Reconstruction  Performance 

Evaluating the RIP constant provides theoretical insight into what the
performance gain would be when using CSEC. In this section, we study the
practical implications by evaluating the probability $P_{ex}$ with which CSEC
could deliver the original signal exactly. We do this by performing a
Monte-Carlo simulation over $10^4$ random instances of a length 256 sparse
signal and computing how often CS erasure coding results in exact recovery.
Figs. 7 to 11 illustrate the comparative performance of using Fourier random
sampling (left) and the Gaussian projection method (right) for CSEC. For each
plot, the “No Loss” curve indicates the baseline with $m = 64$ (4:1
compression) and the “Loss” curve indicates the probability when no
oversampling is performed. The “CSEC” (red) curve indicates $P_{ex}$ with
compressive oversampling at $k = m/(1 - p)$ and the two “CS” curves indicate
intermediate values with $m < k < m/(1 - p)$. The x-axis indicates the number
of non-zero coefficients in $x$.

Figure 7. Probability of recovery for random sampling with a memoryless
erasure channel.

Figure 8. Probability of recovery for Gaussian projections with a memoryless
erasure channel.

Three channel models have been used to generate these figures. Figs. 7 and 8
mimic the channel model used in Figs. 3 and 4, a memoryless erasure channel
modeled as an independent Bernoulli process with $p = 0.2$. We see for both
Fourier random sampling and Gaussian projections that, when
$k - k' = kp = 16$ measurements are lost on average, $k = m/(1 - p) = 80$
recovers performance to the original $m = 64$ level. Observe that if the bound
in (6) is not met (beyond about $s = 10$), the performance for a particular $k$
drops gradually with $s$. Note also that $P_{ex}$ decays more quickly to 0 in
the case of Gaussian projections and, as a comparison with Figs. 3 and 4 shows,
this is because the RIP constant is also higher for the latter. The
intermediate values of $k$ ($k = 69$ and $k = 75$) in both cases deliver
intermediate levels of quality, as predicted in Sec. II.E.

Figs. 9 and 10 use the same Gilbert-Elliott channel model as Figs. 6 and 5,
with $p = 0.2$ and $b = 8$. It is striking to note that, due to the burstiness
of the channel, the performance of neither Fourier random sampling nor Gaussian
projections reaches the baseline for low sparsity levels. Further, while the
highest $s$ for which $P_{ex} = 1$ has gone down substantially for the lossy
scenarios, the slope of the $P_{ex}$ curve is also reduced. The reason for this
is that the distribution of received sample lengths, $k'$, across the
Monte-Carlo runs is skewed and asymmetric about the mean for bursty channels,
whereas it is symmetric about $k(1 - p)$ and is Gaussian for a memoryless
channel. As an example, the mean value of $k'$ for CSEC across runs is
$\mu_{k'} \approx 64$ for both the Bernoulli channel and the GE channel, but
their standard deviations are $\sigma_{k'}^{Bern} \approx 3$ and
$\sigma_{k'}^{GE} \approx 10$, respectively. Interestingly, though, the sample
length distribution is skewed toward higher $k'$, and the result is that the
probability of reconstruction at larger sparsity levels is actually higher than
the baseline. An unexpected result from Fig. 9 is that interleaving makes
little or no difference to Fourier random sampling.

Figure 9. Probability of recovery for random sampling with a Gilbert-Elliott
channel model.

Figure 10. Probability of recovery for Gaussian projections with a
Gilbert-Elliott channel model.

Figure 11. Probability of recovery for random sampling with a real 802.15.4
channel using GE model.

Fig. 11 has been generated using a wireless network trace from the CRAWDAD
database [11], which provides extensive network performance datasets collected
from a wide array of conditions. The particular trace we selected used sensor
nodes with an IEEE 802.15.4 radio transceiver placed about 12 m apart between
two different floors of a university building. This trace had the highest loss
probability and burstiness across the 27 traces collected, with
$p \approx 0.15$ and $b = 1.2$. We built a GE channel model based on the trace
and simulated the probability of exact recovery as before. There is very little
burstiness in the channel, and Fig. 11 shows that CSEC will be able to deliver
near baseline performance with either Fourier random sampling or Gaussian
projections (the plot for Gaussian projections was omitted since it was
identical to Fig. 8).

Figure 12. Comparison between received sample length distributions resulting
from transmission through memoryless and Gilbert-Elliott channels.

Figure 13. Energy consumption comparison for different sampling strategies
(from left to right: Sample-and-Send (S-n-S, m = 256), Compress-and-Send
(C-n-S, m = 10), Compressive Sensing (CS, m = 64), S-n-S with Reed-Solomon
encoding (S-n-S+RS, k = 320), C-n-S with Reed-Solomon encoding (C-n-S+RS,
k = 16) and CS Erasure Coding by Oversampling (CSEC, k = 80)).

C.  CSEC  Implementation  Costs 

We can quantify the energy efficiency gains that CS promises too. In
particular, we use random sampling with $n = 256$ and compare it to two cases:
first, where a standard (255,223) Reed-Solomon (RS) [15] code, a popular BCH
code, is applied to a set of 256 raw 16-bit samples and second, where RS is
applied to a compressed version of the signal. We assume the signal is sparse
($s < 10$) in the Fourier domain and use a 256-point FFT for source
compression. Fig. 13 shows this comparison, which also includes energy
consumption costs without RS. The data has been extracted using a cycle and
energy accurate instruction-level simulator [21] available for the popular
MicaZ sensor platform. While the analysis is specific to this platform, the
insight from these results can be applied more generally.

We have split the costs among five blocks, which are significant for the
comparison: random number generator (for CS), ADC, FFT processing, radio
transmission and Reed-Solomon coding. The total energy consumption of
sample-and-send and compress-and-send is almost equal without RS, with the
radio taking a large chunk of the former and the FFT routine consuming half of
the latter. Notice also that the ADC energy consumption is substantial, since
both these techniques need to operate on the entire signal vector. On the other
hand, the CS routine at $m = 64$ (4:1 compression) requires a fraction of the
ADC and radio, but incurs an overhead for generating random numbers. The
current implementation uses an inexpensive 16-bit LFSR for pseudo-random number
generation.
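
To give a sense of how little work such a generator needs, the sketch below
draws $m$ distinct sampling indices from a 16-bit Fibonacci LFSR. The paper
does not state the feedback polynomial, seed, or index-selection rule, so the
taps (16, 14, 13, 11, a commonly used maximal-length choice), the seed
`0xACE1`, and both function names are assumptions made for illustration.

```python
def lfsr16(state):
    """Advance a 16-bit Fibonacci LFSR by one step (taps 16, 14, 13, 11)."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)

def random_sample_indices(n, m, seed=0xACE1):
    """Pick m distinct sample indices in [0, n) from the LFSR stream.

    The seed must be non-zero; an all-zero register never changes.
    """
    state, chosen = seed, set()
    while len(chosen) < m:
        for _ in range(16):      # shift in 16 fresh bits per draw
            state = lfsr16(state)
        chosen.add(state % n)    # duplicates are simply drawn again
    return sorted(chosen)

indices = random_sample_indices(n=256, m=64)
```

Each step costs a few shifts and XORs, which is consistent with the small
random-number overhead reported for the CS strategies.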


When the data is RS encoded before transmission, the energy consumption of the
sample-and-send strategy jumps considerably, whereas the increase for
compress-and-send is negligible, because it is sending at most 10 coded symbols
(with 6 parity symbols). We chose $s < 10$ since that is the threshold below
which $m = 64$ in the lossless case and $k = 80$ for a memoryless erasure
channel result in exact CS recovery (refer to Figs. 7 and 11). This means that,
with a $p = 0.2$ memoryless channel, all three strategies on the right will
deliver equivalent recovery performance. When comparing encoding cost, however,
CS erasure coding is 2.5× better than performing local source compression and
3× better than sending raw samples.

IV.  Related  Work  and  Discussion 

Communication systems routinely employ ARQ retransmissions and forward error
correction (FEC), sometimes simultaneously, at different layers of the
communication protocol stack to recover from erroneous and missing data. For
sensor networks, however, the simplicity of ARQ has retained it as the dominant
form of error recovery. Many researchers have questioned this recently and
evaluated FEC techniques through lab experiments.

For example, Schmidt et al. [20] focused on convolutional coding to show that a
modified Turbo code is quite feasible on a MicaZ platform. While they report
that the energy consumption using Turbo codes is about 3.5× that of its
un-coded counterpart, the overall energy efficiency is better considering
retransmission costs, especially on high loss links. They also show that the
computational complexity of Turbo encoding is practical, but only with low-rate
data transfers (they tested with one packet every second). Jeong et al. [19]
proposed using a simpler error correcting code for only single and double-bit
errors to reduce this computation burden. They illustrate, through experimental
data, that long error bursts are rare in static sensor networks and argue that
the complexity of RS or LT codes is unwarranted. They show that their error
correcting code reduces the packet drop rate almost to zero for outdoor
deployments. However, due to a higher frequency of multiple-bit errors indoors,
recovery remains imperfect there. We have shown that CSEC can provide both
computational benefits and recovery performance that parallels
state-of-the-art erasure correcting codes. To use CSEC, however, one must have
a good understanding of the physical phenomena being acquired and the domain in
which they can be compressed.

Further, Wood et al. [22] recently reported the use of online codes in
low-power sensor networks. Online codes are a form of digital fountain codes,
such as the LT codes [12], but are simpler to encode and decode. They propose a
lightweight feedback mechanism that allows the encoder to cope with variations
in link quality rapidly and efficiently. They point out, however, that multiple
parameters need to be tuned in order for the coding to be efficient. This is
similar to the sensitivity of LT codes to the degree distribution [12]. While
our current work has focused on block-wise decoding and makes analogies to
other linear block coding strategies such as BCH codes, CSEC can be used in a
“rateless” mode as well, similar to fountain codes. Asif et al. [23]
demonstrate a way of streaming incoherent measurements and describe a
homotopy-based approach that performs iterative decoding as measurements are
received. Using this technique, while exploiting the democracy property of
sensing matrices [24], a stream of measurements that is transmitted through an
erasure channel can be progressively decoded as measurements are gradually
received.

Figure 14. Probability of recovery for random sampling with a memoryless
channel and 8 measurements per packet.

We have described CSEC for handling erasures in a channel, but CSEC can be
extended to correct for errors in the sensor transduction process too. This
means that a controlled amount of sensor noise can be cleaned from the acquired
measurements during the decompression process. This is achieved by using Basis
Pursuit De-noising [8], which changes the equality constraint in (2) to an
inequality to account for variations due to noise. Note, however, that since
CSEC utilizes features of the physical phenomenon and operates on the acquired
signal, and not on the modulated symbols transmitted through the wireless
channel, CSEC is not useful for correcting symbol errors at a communication
receiver. A better approach to tackling the latter using $\ell_1$ minimization
techniques is discussed by Candès and Tao in [6].
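
For concreteness, the de-noising variant is commonly written as below. This
sketch assumes that (2) is the standard basis pursuit program, minimizing
$\|x\|_1$ subject to $y = Ax$; the noise tolerance $\epsilon$ is an assumed
symbol that does not appear in the text above.

```latex
\[
  \hat{x} = \arg\min_{x} \|x\|_1
  \quad \text{subject to} \quad
  \|y - Ax\|_2 \le \epsilon .
\]
```

Setting $\epsilon = 0$ recovers the equality-constrained problem.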

In Sec. III.A, we used the RIP constant of the sensing matrix as a way of
verifying its reconstruction performance. Another technique that was recently
proposed, namely the null-space property [9], could also have been used. Until
very recently, however, the null-space property was as difficult to compute as
the RIP constant. In the future, we will not only use the null-space property
for evaluating sensing matrices numerically but also study its use to analyze
the recovery properties of CSEC, especially with Gaussian projections.

Finally, we note that the evaluation studies in Sec. III assumed that
measurements are streamed to the receiver as they are acquired. If one
packetizes the measurements for transmission, then even in a memoryless
channel, the sample losses will no longer be independent and will instead show
high burstiness. An example of this is shown in Fig. 14, which shows the
probability of recovery when 8 measurements are transmitted in every packet. We
defer a detailed study of the effects of packetization in realistic wireless
channels to future work.
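
The way packetization turns independent packet losses into bursty measurement
losses can be seen in the short sketch below. The function name and parameters
are hypothetical; the only facts taken from the text are the packet size of 8
measurements and the memoryless (independent Bernoulli) packet loss model.

```python
import random

def packetized_erasures(k, per_packet, p, rng):
    """Drop whole packets independently with probability p.

    Returns a list of k booleans; True marks a lost measurement.
    """
    mask = []
    while len(mask) < k:
        lost = rng.random() < p
        # Every measurement in a packet shares that packet's fate.
        mask.extend([lost] * min(per_packet, k - len(mask)))
    return mask

mask = packetized_erasures(k=80, per_packet=8, p=0.2, rng=random.Random(3))
```

Every loss event now removes 8 consecutive measurements, so the index
distribution seen by the decoder resembles that of a bursty channel even though
the packet losses themselves are independent.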

V.  Conclusion 

We have explored the application of Compressive Sensing to handling data loss
from erasure channels by viewing it as a low encoding-cost, proactive erasure
correction scheme. We showed that CS erasure coding is efficient when the
channel is memoryless and employed the RIP to illustrate that even extreme
stochasticity in losses can be handled cheaply and effectively. We showed that
for the Fourier random sampling scheme, oversampling is much less expensive
than competing erasure coding methods and performs just as well. This makes it
an attractive choice for low-power embedded sensing where forward erasure
correction is needed.